AITopics | monte carlo counterfactual regret minimization

monte carlo counterfactual regret minimization

Robust Deep Monte Carlo Counterfactual Regret Minimization: Addressing Theoretical Risks in Neural Fictitious Self-Play

Jaafari, Zakaria El

arXiv.org Machine Learning | Sep-3-2025

Monte Carlo Counterfactual Regret Minimization (MCCFR) has emerged as a cornerstone algorithm for solving extensive-form games, but its integration with deep neural networks introduces scale-dependent challenges that manifest differently across game complexities. This paper presents a comprehensive analysis of how neural MCCFR component effectiveness varies with game scale and proposes an adaptive framework for selective component deployment. We identify that theoretical risks such as nonstationary target distribution shifts, action support collapse, variance explosion, and warm-starting bias have scale-dependent manifestation patterns, requiring different mitigation strategies for small versus large games. Our proposed Robust Deep MCCFR framework incorporates target networks with delayed updates, uniform exploration mixing, variance-aware training objectives, and comprehensive diagnostic monitoring. Through systematic ablation studies on Kuhn and Leduc Poker, we demonstrate scale-dependent component effectiveness and identify critical component interactions. The best configuration achieves final exploitability of 0.0628 on Kuhn Poker, representing a 60% improvement over the classical framework (0.156). On the more complex Leduc Poker domain, selective component usage achieves exploitability of 0.2386, a 23.5% improvement over the classical framework (0.3703), highlighting the importance of careful component selection over comprehensive mitigation. Our contributions include: (1) a formal theoretical analysis of risks in neural MCCFR, (2) a principled mitigation framework with convergence guarantees, (3) comprehensive multi-scale experimental validation revealing scale-dependent component interactions, and (4) practical guidelines for deployment in larger games.
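
The abstract names "uniform exploration mixing" as a guard against action support collapse but does not give a formula. As a rough illustration only, the sketch below assumes the common form in which a regret-matching strategy is blended with the uniform distribution. The function names and the `epsilon` parameter are hypothetical and are not taken from the paper.

```python
from typing import List


def regret_matching(regrets: List[float]) -> List[float]:
    """Standard regret matching: play actions in proportion to positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    n = len(regrets)
    if total <= 0.0:
        # No action has positive regret: fall back to the uniform strategy.
        return [1.0 / n] * n
    return [p / total for p in positive]


def mix_with_uniform(strategy: List[float], epsilon: float) -> List[float]:
    """Assumed mixing rule: (1 - epsilon) * strategy + epsilon * uniform.

    Every action keeps probability of at least epsilon / n, so no action
    drops out of the sampling support.
    """
    n = len(strategy)
    return [(1.0 - epsilon) * p + epsilon / n for p in strategy]


# Illustrative regrets at one information set (made-up numbers).
regrets = [2.0, -1.0, 0.0]
sigma = regret_matching(regrets)                 # all mass on the first action
explore = mix_with_uniform(sigma, epsilon=0.1)   # every action now has mass
```

With pure regret matching the second and third actions would never be sampled here; after mixing, each keeps a small positive probability while the result still sums to one.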

artificial intelligence, information, machine learning, (14 more...)

2509.00923

Country: North America > United States > Texas (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Search in Imperfect Information Games

Schmid, Martin

arXiv.org Artificial Intelligence | Nov-10-2021

From the very dawn of the field, search with value functions has been a fundamental concept of computer games research. Turing's chess algorithm from 1950 was able to think two moves ahead, and Shannon's work on chess from 1950 includes an extensive section on evaluation functions to be used within a search. Samuel's checkers program from 1959 already combines search and value functions that are learned through self-play and bootstrapping. TD-Gammon improves upon those ideas and uses neural networks to learn those complex value functions -- only to be used again within search. The combination of decision-time search and value functions has been present in the remarkable milestones where computers bested their human counterparts in long-standing challenging games -- Deep Blue for Chess and AlphaGo for Go. Until recently, this powerful framework of search aided with (learned) value functions was limited to perfect information games. As many interesting problems do not provide the agent with perfect information about the environment, this was an unfortunate limitation. This thesis introduces the reader to sound search for imperfect information games.

continual re-solving algorithm, monte carlo counterfactual regret minimization, strong abstraction-based poker agent, (16 more...)

2111.05884

Country:

North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(7 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Leisure & Entertainment > Games > Chess (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(5 more...)

AI smokes 5 poker champs at a time in no-limit Hold'em with 'relentless consistency' – TechCrunch

#artificialintelligence | Jul-12-2019, 06:36:45 GMT

The machines have proven their superiority in one-on-one games like chess and Go, and even poker -- but in complex multiplayer versions of the card game humans have retained their edge… until now. An evolution of the last AI agent to flummox poker pros individually is now decisively beating them in a championship-style 6-person game. As documented in a paper published in the journal Science today, the CMU/Facebook collaboration they call Pluribus reliably beats five professional poker players in the same game, or one pro pitted against five independent copies of itself. It's a major leap forward in capability for the machines, and amazingly is also far more efficient than previous agents. One-on-one poker is a weird game, and not a simple one, but the zero-sum nature of it (whatever you lose, the other player gets) makes it susceptible to certain strategies in which a computer able to calculate out far enough can put itself at an advantage.

artificial intelligence, pluribus, social media, (12 more...)

Genre: Research Report (0.89)

Industry: Leisure & Entertainment > Games > Poker (0.49)

Technology:

Information Technology > Communications > Social Media (0.59)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.50)
Information Technology > Artificial Intelligence > Games (0.48)

Search in Imperfect Information Games Using Online Monte Carlo Counterfactual Regret Minimization

Lanctot, Marc (Maastricht University) | Lisy, Viliam (Czech Technical University in Prague) | Bowling, Michael (University of Alberta)

AAAI Conferences | Jul-22-2014

Online search in games has always been a core interest of artificial intelligence. Advances made in search for perfect information games (such as Chess, Checkers, Go, and Backgammon) have led to AI capable of defeating the world's top human experts. Search in imperfect information games (such as Poker, Bridge, and Skat) is significantly more challenging due to the complexities introduced by hidden information. In this paper, we present Online Outcome Sampling (OOS), the first imperfect information search algorithm that is guaranteed to converge to an equilibrium strategy in two-player zero-sum games. We show that OOS avoids common problems encountered by existing search algorithms, and we experimentally evaluate its convergence rate and practical performance against benchmark strategies in Liar's Dice and a variant of Goofspiel. We show that, unlike with Information Set Monte Carlo Tree Search (ISMCTS), the exploitability of the strategies produced by OOS decreases as the amount of search time increases. In practice, OOS performs as well as ISMCTS in head-to-head play while producing strategies with lower exploitability given the same search time.

artificial intelligence, game theory, monte carlo counterfactual regret minimization, (1 more...)

Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.73)